1  INTRODUCTION

Although the Discrete Hartley Transform (DHT) has been around for many years(ref.1,2),
considerable new interest in this transform has been generated recently. This renewed interest
is a result of the discovery of a Fast Hartley Transform (FHT) algorithm(ref.3). In common
with the Fast Fourier Transform (FFT), the FHT algorithm computes the DHT of a data
sequence of N elements in a time proportional to $N \log_2 N$. Early work(ref.3) indicates the
FHT is as fast as or faster than the FFT, suggesting that the FHT may be a more efficient
substitute for the FFT in areas such as spectral analysis, digital processing, and convolution.

The signal processing scheme for the Jindalee Over The Horizon Radar (OTHR) employs the
FFT for range processing, digital beamforming, and Doppler processing. The FFT constitutes
a significant portion of the total processing load. Future developments in operational
OTH radars for Australia will lead to an increased range processing load, which relies almost
exclusively on the FFT; hence a more efficient algorithm is of interest.

The definition of the DHT is given in Section 2, along with a summary of its properties.
In Section 3 the fast DHT and DFT algorithms are described and a comparison is made of the
number of processing steps required for the FFT and FHT algorithms. Section 4 establishes
the suitability of these two algorithms for Jindalee signal processing, and conclusions are
drawn in Section 5.


2  THE  HARTLEY  TRANSFORM 

Consider a sequence of N real numbers $x_n$ for n = 0, 1, ..., N - 1. The Discrete Hartley
Transform of this sequence is defined(ref.3) as

$$H_k = \frac{1}{N} \sum_{n=0}^{N-1} x_n \left[ \cos\left(\frac{2\pi nk}{N}\right) + \sin\left(\frac{2\pi nk}{N}\right) \right] \tag{1}$$

k = 0, 1, ..., N - 1.


This is often written as

$$H_k = \frac{1}{N} \sum_{n=0}^{N-1} x_n \,\mathrm{cas}\left(\frac{2\pi nk}{N}\right) \tag{2}$$

k = 0, 1, ..., N - 1

where $\mathrm{cas}(\theta) = \cos(\theta) + \sin(\theta)$.

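The inverse DHT of a sequence of N real numbers $H_k$ for k = 0, 1, ..., N - 1 is given by

$$x_n = \sum_{k=0}^{N-1} H_k \left[ \cos\left(\frac{2\pi nk}{N}\right) + \sin\left(\frac{2\pi nk}{N}\right) \right] \tag{3}$$

n = 0, 1, ..., N - 1.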
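The forward and inverse definitions can be checked numerically by direct summation. The following is a minimal sketch, assuming the convention that the 1/N scaling sits on the forward transform; the function names `cas`, `dht` and `idht` are illustrative and do not come from the report.

```python
import math

def cas(theta):
    """cas(theta) = cos(theta) + sin(theta)."""
    return math.cos(theta) + math.sin(theta)

def dht(x):
    """Forward DHT by direct summation (assumed 1/N scaling on the forward transform)."""
    n_pts = len(x)
    return [
        sum(x[n] * cas(2 * math.pi * n * k / n_pts) for n in range(n_pts)) / n_pts
        for k in range(n_pts)
    ]

def idht(h):
    """Inverse DHT: the same sum as the forward transform, without the 1/N scaling."""
    n_pts = len(h)
    return [
        sum(h[k] * cas(2 * math.pi * n * k / n_pts) for k in range(n_pts))
        for n in range(n_pts)
    ]

# Round trip on an arbitrary real test sequence.
x = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 4.0]
x_back = idht(dht(x))
print(max(abs(a - b) for a, b in zip(x, x_back)))  # small rounding error only
```

Both functions evaluate the same cas-weighted sum; only the scaling differs. Direct summation of this kind takes of order N squared operations.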


The DHT has a number of interesting properties that can be more readily understood by
comparison with the Discrete Fourier Transform (DFT). The corresponding expressions for
the DFT and inverse DFT are

$$F_k = \frac{1}{N} \sum_{n=0}^{N-1} x_n \left[ \cos\left(\frac{2\pi nk}{N}\right) - j \sin\left(\frac{2\pi nk}{N}\right) \right] \tag{4}$$

k = 0, 1, ..., N - 1

$$x_n = \sum_{k=0}^{N-1} F_k \left[ \cos\left(\frac{2\pi nk}{N}\right) + j \sin\left(\frac{2\pi nk}{N}\right) \right] \tag{5}$$

n = 0, 1, ..., N - 1.


From equations (2) and (3) we see that the forward and inverse DHTs are identical (apart
from a scaling factor). This can be an advantage on limited-memory machines, since only
one algorithm need be stored in program memory, rather than the two required for the DFT
and inverse DFT. Equations (4) and (5) show that the forward and inverse DFTs differ by a
sign change of the imaginary part. Also, the DHT uses real arithmetic only, while the DFT
requires complex arithmetic. The absence of complex arithmetic gives the DHT the appearance
of being simpler. It will be shown, however, that the DHT and DFT are of similar
complexity.

It must be emphasised that the DHT and DFT are distinct transforms. Both offer an
alternative way of representing the same data. The DFT representation of the data sequence
$x_n$ for n = 0, 1, ..., N - 1 gives us amplitude and phase information on sinusoidal frequencies
present in $x_n$. The DHT gives the same information, but in a slightly modified form.

From equations (1) and (4) it is clear that the Hartley Transform $H_k$ is not the same as the
Fourier Transform $F_k$. As it is the Fourier Transform that we generally desire, the Hartley
Transform is only of use if we can readily derive the Fourier Transform from it.

The DFT may be derived from the DHT as follows. Remembering that cos is an even
function and sin is an odd function, inspection of equations (1) and (4) reveals that, for real
$x_n$ with n = 0, 1, ..., N - 1, the even part of the DHT is equivalent to the real part of the
DFT, and the odd part of the DHT is equivalent to the negative of the imaginary part of
the DFT. Thus the DFT can be derived from the DHT by N/2 add operations and N/2 subtract
operations (ignoring scaling by 2). Noting that $H_0 = H_N$ and $F_0 = F_N$, mathematically we
can express the required relationships as

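$$\mathrm{Real}\{F_k\} = \frac{1}{2}\left[ H_k + H_{N-k} \right] \tag{6}$$

with $\mathrm{Real}\{F_k\}$ symmetrical about $F_{N/2}$, ie $\mathrm{Real}\{F_k\} = \mathrm{Real}\{F_{N-k}\}$.

$$\mathrm{Imag}\{F_k\} = -\frac{1}{2}\left[ H_k - H_{N-k} \right] \tag{7}$$

k = 0, 1, ..., N/2 - 1

with $\mathrm{Imag}\{F_k\}$ anti-symmetric about $F_{N/2}$, ie $\mathrm{Imag}\{F_k\} = -\mathrm{Imag}\{F_{N-k}\}$.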
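The even/odd relationships above can be verified with a short numerical sketch. It assumes the 1/N scaling on both forward transforms and evaluates the relations for every k rather than only the first half. The helper names `dht`, `dft` and `dft_from_dht` are illustrative.

```python
import cmath
import math

def dht(x):
    """Direct DHT with 1/N scaling (assumed convention)."""
    n_pts = len(x)
    return [
        sum(x[n] * (math.cos(2 * math.pi * n * k / n_pts)
                    + math.sin(2 * math.pi * n * k / n_pts)) for n in range(n_pts)) / n_pts
        for k in range(n_pts)
    ]

def dft(x):
    """Direct DFT with 1/N scaling, used only as a reference for comparison."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_pts) for n in range(n_pts)) / n_pts
        for k in range(n_pts)
    ]

def dft_from_dht(h):
    """Real part from the even part of H, imaginary part from minus the odd part."""
    n_pts = len(h)
    out = []
    for k in range(n_pts):
        h_rev = h[(n_pts - k) % n_pts]  # H_{N-k}, using H_N = H_0
        out.append(complex(0.5 * (h[k] + h_rev), -0.5 * (h[k] - h_rev)))
    return out

x = [0.5, -1.0, 2.0, 3.5, 0.0, 1.0, -2.5, 4.0]
f_via_dht = dft_from_dht(dht(x))
f_direct = dft(x)
print(max(abs(a - b) for a, b in zip(f_via_dht, f_direct)))  # small rounding error only
```

In practice only k = 0, 1, ..., N/2 - 1 would be computed and the remaining values obtained from the symmetry of the real and imaginary parts.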


From the above it should be evident that for real $x_n$ with n = 0, 1, ..., N - 1, to determine the
DFT $F_k$ for k = 0, 1, ..., N - 1 via the DHT, it is necessary to calculate the Hartley Transform
$H_k$ for k = 0, 1, ..., N - 1. There exist redundancies, however, in the DFT. It can be readily
seen that $F_k = F^*_{N-k}$, so it is only necessary to determine $F_k$ for k = 0, 1, ..., N/2 - 1, ie only
half of the Fourier Transform need be calculated for real $x_n$. This redundancy balances with
the DHT's absence of complex arithmetic, making both transforms of similar complexity (in
terms of the number of arithmetic steps required).

The DFT can be performed on a sequence of complex data (ie $x_n$ complex in (4)) and the
transform will be of the same length and generally will also be complex. In contrast, the
DHT can only be performed on a real data sequence ($x_n$ real in (2)), the transform also
being a real data sequence of the same length. We then have the problem: how do we use
the Hartley algorithm to transform a complex sequence?

We can determine the Fourier Transform of a complex data sequence via the Hartley Transform
by separately transforming the real and imaginary parts of the complex sequence, and
then recombining these transforms. These steps are illustrated mathematically below.

Consider the complex sequence $z_n = x_n + j y_n$ for n = 0, 1, ..., N - 1 where $x_n$ and $y_n$ are both
real sequences. Denoting the Fourier Transform operator by $\mathcal{F}\{\,\}$ we have, by linearity,

$$\begin{aligned}
Z_k &= \mathcal{F}\{z_n\} = \mathcal{F}\{x_n + j y_n\} \\
    &= \mathcal{F}\{x_n\} + j \mathcal{F}\{y_n\} \\
    &= X_k + j Y_k
\end{aligned} \tag{8}$$

k = 0, 1, ..., N - 1

where $X_k$, $Y_k$, and $Z_k$ are the Fourier Transforms of $x_n$, $y_n$, and $z_n$ respectively. The DHT
can be used to compute $X_k$ and $Y_k$ from $x_n$ and $y_n$ in the manner described above. Thus
with little extra complexity, the DHT can be used to compute the DFT for real or complex
data (complex data requiring that two distinct Hartley Transforms be performed).
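The recombination step can be sketched as follows. This is an illustrative check rather than an efficient implementation: it uses direct summation with an assumed 1/N scaling, and all function names are hypothetical.

```python
import cmath
import math

def dht(x):
    """Direct DHT of a real sequence with 1/N scaling (assumed convention)."""
    n_pts = len(x)
    return [
        sum(x[n] * (math.cos(2 * math.pi * n * k / n_pts)
                    + math.sin(2 * math.pi * n * k / n_pts)) for n in range(n_pts)) / n_pts
        for k in range(n_pts)
    ]

def dft_of_real_via_dht(x):
    """DFT of a real sequence from the even and odd parts of its DHT."""
    h = dht(x)
    n_pts = len(h)
    return [
        complex(0.5 * (h[k] + h[(n_pts - k) % n_pts]),
                -0.5 * (h[k] - h[(n_pts - k) % n_pts]))
        for k in range(n_pts)
    ]

def dft_of_complex_via_dht(z):
    """Z_k = X_k + j Y_k, using one Hartley Transform per real component."""
    x_k = dft_of_real_via_dht([v.real for v in z])
    y_k = dft_of_real_via_dht([v.imag for v in z])
    return [a + 1j * b for a, b in zip(x_k, y_k)]

def dft_direct(z):
    """Reference DFT by direct summation with 1/N scaling."""
    n_pts = len(z)
    return [
        sum(z[n] * cmath.exp(-2j * cmath.pi * n * k / n_pts) for n in range(n_pts)) / n_pts
        for k in range(n_pts)
    ]

z = [1 + 2j, -0.5 + 0j, 3 - 1j, 0 + 4j, 2 + 2j, -1 - 1j, 0.5 + 0.25j, -3 + 1j]
print(max(abs(a - b) for a, b in zip(dft_of_complex_via_dht(z), dft_direct(z))))
```

Note that $X_k$ and $Y_k$ are themselves complex, so the final recombination mixes their real and imaginary parts.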


3  PRINCIPLES  OF  FAST  DHT  AND  DFT  ALGORITHMS 

While there exist many applications for the Discrete Fourier Transform (eg spectral analysis,
correlation, convolution), computing the DFT via the definition given in Section 2 requires a
considerable amount of processing. To directly transform a sequence of length N requires a
number of arithmetic operations of order $N^2$, ie doubling the length of the original sequence
results in a four-fold increase in the processing load. For large N this method becomes
impractical, requiring enormous processing power. In 1965 a fast alternative method of
determining the DFT was reported(ref.4). This Fast Fourier Transform could compute the
DFT in a time proportional to $N \log_2 N$, making it far more suitable than direct calculation
when transforming sequences containing many points.

The development of the Hartley Transform has proceeded in an analogous way. The Discrete
Hartley Transform was first defined in 1942(ref.1). Calculation via the definition, given
in Section 2, executes in a time proportional to $N^2$. As with the FFT, a Fast Hartley
Transform has been defined(ref.3). This FHT algorithm, like the FFT, also executes in a time
proportional to $N \log_2 N$, and has generated considerable interest since its discovery by
Bracewell in 1984.

It is instructive to analyse the way in which these fast transforms work. We will look at
the radix-2 FHT and FFT algorithms. Although not the most efficient fast algorithms, the
radix-2 transforms are the most widely understood fast method of transforming a sequence
of length N, with $N = 2^i$, i an integer.

3.1  THE  RADIX-2  FFT

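To determine the Fourier Transform of $x_n$, n = 0, 1, ..., N - 1 with $N = 2^i$,
let

$$y_n = x_{2n}$$

$$z_n = x_{2n+1}$$

n = 0, 1, ..., N/2 - 1.

The sequences $y_n$ and $z_n$ are each of length N/2, and have transforms (the scaling
factor used in Section 2 is omitted here)

$$Y_k = \sum_{n=0}^{N/2-1} y_n e^{-j\frac{4\pi nk}{N}} \tag{9}$$

$$Z_k = \sum_{n=0}^{N/2-1} z_n e^{-j\frac{4\pi nk}{N}} \tag{10}$$

k = 0, 1, ..., N/2 - 1.

The transform that we seek is $X_k$,

$$X_k = \sum_{n=0}^{N-1} x_n e^{-j\frac{2\pi nk}{N}} \tag{11}$$

k = 0, 1, ..., N - 1.

Expression (11) can be manipulated to give

$$\begin{aligned}
X_k &= \sum_{n=0}^{N/2-1} x_{2n} e^{-j\frac{2\pi (2n)k}{N}} + \sum_{n=0}^{N/2-1} x_{2n+1} e^{-j\frac{2\pi (2n+1)k}{N}} \\
    &= \sum_{n=0}^{N/2-1} y_n e^{-j\frac{4\pi nk}{N}} + e^{-j\frac{2\pi k}{N}} \sum_{n=0}^{N/2-1} z_n e^{-j\frac{4\pi nk}{N}}
\end{aligned}$$

so that

$$X_k = Y_k + e^{-j\frac{2\pi k}{N}} Z_k. \tag{12}$$

Also

$$X_{k+\frac{N}{2}} = Y_k + e^{-j\pi} e^{-j\frac{2\pi k}{N}} Z_k$$

or

$$X_{k+\frac{N}{2}} = Y_k - e^{-j\frac{2\pi k}{N}} Z_k. \tag{13}$$

Expressions (12) and (13) are often written as

$$X_k = Y_k + W^k Z_k \tag{14}$$

$$X_{k+\frac{N}{2}} = Y_k - W^k Z_k \tag{15}$$

with

$$W^k = e^{-j\frac{2\pi k}{N}}.$$

The pair of equations (14) and (15) is often referred to as the FFT kernel or butterfly
operation, as it represents the fundamental building block of the radix-2 FFT algorithm. The
butterfly operation requires N/2 complex multiplications and N complex additions to compute
the sequence $X_k$ for k = 0, 1, 2, ..., N - 1, from the sequences $Y_k$ and $Z_k$, k = 0, 1, 2, ..., N/2 - 1.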

The term 'radix-2' is due to $X_k$ being determined from the two transforms $Y_k$ and $Z_k$. A
large number of different radix algorithms have been tried, including radix-4 and radix-8.
Interestingly, one of the fastest FFT algorithms, the Split-Radix, uses a hybrid scheme where
the length N DFT is computed from a length N/2 plus two length N/4 DFTs.

The principle of the radix-2 FFT can be described as:

1.  generate the length N DFT from 2 length N/2 DFTs

2.  generate each length N/2 DFT from 2 length N/4 DFTs

    ...

i.  generate each length 2 DFT from 2 length 1 DFTs,
    - each length 1 DFT is equal to itself.


Each of the above i steps ($i = \log_2 N$) requires N/2 complex multiplications and N complex
additions. Thus the length N FFT requires $\frac{N}{2} \log_2 N$ complex multiplications and $N \log_2 N$
complex additions.
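The recursive principle described above can be sketched directly from the butterfly equations (14) and (15). This is a minimal, unoptimised illustration (recursive, unscaled, allocating new lists at every level), not the implementation used on any particular processor. The names `fft_radix2` and `dft_direct` are illustrative.

```python
import cmath

def fft_radix2(x):
    """Unscaled radix-2 FFT by recursive even/odd splitting; len(x) must be a power of two."""
    n_pts = len(x)
    if n_pts == 0 or n_pts & (n_pts - 1):
        raise ValueError("length must be a power of two")
    if n_pts == 1:
        return [complex(x[0])]          # a length 1 DFT is equal to itself
    y_k = fft_radix2(x[0::2])           # transform of the even-indexed samples
    z_k = fft_radix2(x[1::2])           # transform of the odd-indexed samples
    half = n_pts // 2
    out = [0j] * n_pts
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / n_pts) * z_k[k]  # W^k Z_k: one complex multiply
        out[k] = y_k[k] + t             # equation (14)
        out[k + half] = y_k[k] - t      # equation (15)
    return out

def dft_direct(x):
    """Unscaled DFT by direct summation, for comparison only."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]

x = [1.0, 2.0, -1.0, 0.5, 3.0, 0.0, -2.0, 4.0]
print(max(abs(a - b) for a, b in zip(fft_radix2(x), dft_direct(x))))  # rounding error only
```

Each level of recursion performs N/2 complex multiplications and N complex additions in total, which is where the $\frac{N}{2} \log_2 N$ and $N \log_2 N$ counts come from.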


3.2  THE  RADIX-2  FHT 

The development of the Fast Hartley Transform proceeds in a similar way to the FFT. Suppose
we wish to determine the Hartley Transform of a real sequence $x_n$ for n = 0, 1, 2, ..., N - 1
and $N = 2^i$ with i an integer.

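Let

$$y_n = x_{2n}$$

$$z_n = x_{2n+1}$$

n = 0, 1, ..., N/2 - 1.

The real sequences $y_n$ and $z_n$ are each of length N/2 and have transforms

$$Y_k = \sum_{n=0}^{N/2-1} y_n \,\mathrm{cas}\left(\frac{4\pi nk}{N}\right)$$

$$Z_k = \sum_{n=0}^{N/2-1} z_n \,\mathrm{cas}\left(\frac{4\pi nk}{N}\right)$$

k = 0, 1, ..., N/2 - 1.

We seek the transform $X_k$,

$$X_k = \sum_{n=0}^{N-1} x_n \left[ \cos\left(\frac{2\pi nk}{N}\right) + \sin\left(\frac{2\pi nk}{N}\right) \right].$$

Expanding gives

$$X_k = \sum_{n=0}^{N/2-1} x_{2n} \,\mathrm{cas}\left(\frac{2\pi (2n)k}{N}\right) + \sum_{n=0}^{N/2-1} x_{2n+1} \,\mathrm{cas}\left(\frac{2\pi (2n+1)k}{N}\right).$$

This can be further expanded to give

$$\begin{aligned}
X_k = \sum_{n=0}^{N/2-1} y_n \,\mathrm{cas}\left(\frac{4\pi nk}{N}\right) + \sum_{n=0}^{N/2-1} z_n \Big[ &\cos\left(\frac{4\pi nk}{N}\right)\cos\left(\frac{2\pi k}{N}\right) - \sin\left(\frac{4\pi nk}{N}\right)\sin\left(\frac{2\pi k}{N}\right) \\
 + &\sin\left(\frac{4\pi nk}{N}\right)\cos\left(\frac{2\pi k}{N}\right) + \cos\left(\frac{4\pi nk}{N}\right)\sin\left(\frac{2\pi k}{N}\right) \Big].
\end{aligned}$$

Hence

$$X_k = Y_k + \left[ \cos\left(\frac{2\pi k}{N}\right) Z_k + \sin\left(\frac{2\pi k}{N}\right) Z_{\frac{N}{2}-k} \right] \tag{16}$$

and

$$X_{k+\frac{N}{2}} = Y_k - \left[ \cos\left(\frac{2\pi k}{N}\right) Z_k + \sin\left(\frac{2\pi k}{N}\right) Z_{\frac{N}{2}-k} \right] \tag{17}$$

where the index of Z is taken modulo N/2, so that $Z_{N/2} = Z_0$.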


Equations (16) and (17) form the basis of the FHT algorithm. As with the FFT, the FHT
calculates the length N transform from two length N/2 transforms. For the radix-2 algorithms
considered here, both the FHT and FFT require the same number of butterflies.

We see that the FHT butterfly requires two real multiplications and three real additions,
where the FFT butterfly requires one complex multiplication and two complex additions.
Reasoning in the same way as for the FFT, it can be shown that to calculate an N point
FHT requires $N \log_2 N$ real multiplications and $\frac{3}{2} N \log_2 N$ real additions in the form of
butterflies. As described in Section 2, further operations of order N will also be required to
compute the DFT from this Hartley Transform.
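A radix-2 FHT built on the butterfly of equations (16) and (17) can be sketched in the same recursive style as the FFT. The sketch below is unscaled and unoptimised. It assumes the index of the odd-sample transform is taken modulo N/2, which follows from the periodicity of the half-length transform. The names `fht_radix2` and `dht_direct` are illustrative.

```python
import math

def fht_radix2(x):
    """Unscaled radix-2 FHT of a real sequence; len(x) must be a power of two."""
    n_pts = len(x)
    if n_pts == 0 or n_pts & (n_pts - 1):
        raise ValueError("length must be a power of two")
    if n_pts == 1:
        return [float(x[0])]
    y_k = fht_radix2(x[0::2])           # Hartley Transform of the even-indexed samples
    z_k = fht_radix2(x[1::2])           # Hartley Transform of the odd-indexed samples
    half = n_pts // 2
    out = [0.0] * n_pts
    for k in range(half):
        c = math.cos(2 * math.pi * k / n_pts)
        s = math.sin(2 * math.pi * k / n_pts)
        # Two real multiplies and one real add; the Z index wraps modulo N/2.
        t = c * z_k[k] + s * z_k[(half - k) % half]
        out[k] = y_k[k] + t             # equation (16)
        out[k + half] = y_k[k] - t      # equation (17)
    return out

def dht_direct(x):
    """Unscaled DHT by direct summation, for comparison only."""
    n_pts = len(x)
    return [
        sum(x[n] * (math.cos(2 * math.pi * n * k / n_pts)
                    + math.sin(2 * math.pi * n * k / n_pts)) for n in range(n_pts))
        for k in range(n_pts)
    ]

x = [1.0, 2.0, -1.0, 0.5, 3.0, 0.0, -2.0, 4.0]
print(max(abs(a - b) for a, b in zip(fht_radix2(x), dht_direct(x))))  # rounding error only
```

Unlike the FFT butterfly, each output pair needs two values of the odd-sample transform, at indices k and N/2 - k, which is the main structural difference between the two kernels.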

Comparing the execution speed of the FHT and FFT is not straightforward. The FFT
butterfly requires the equivalent of four real multiplies and 6 real adds, twice as many as
for the FHT butterfly, but the FHT can only be used for real data. Generally twice as
many FHT butterflies will be required for a complex transform, so that both the FHT and
FFT require the same number of real arithmetic operations from butterflies alone. Any
advantages that one may have over the other will be of order N. For large N this difference
of order N will become relatively less significant when compared with the total number of
arithmetic operations (which is of order $N \log_2 N$). The issue of execution speed comparison
is complicated further by the existence of more efficient FFT and FHT algorithms. In
particular, an alternative real-valued FFT (RFFT) exists(ref.5) that is shown to be faster
than any known FHT algorithm.
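The butterfly operation counts can be tabulated with a few lines of arithmetic. The sketch below assumes the usual costing in which one complex multiplication is four real multiplies plus two real adds and one complex addition is two real adds. It counts butterfly operations only, ignoring the order-N conversion from Hartley to Fourier form. The function name is illustrative.

```python
def butterfly_op_counts(n_pts):
    """Real-operation counts from butterflies alone for radix-2 transforms of length n_pts."""
    if n_pts < 2 or n_pts & (n_pts - 1):
        raise ValueError("length must be a power of two, at least 2")
    stages = n_pts.bit_length() - 1          # log2(N) for a power of two
    butterflies = (n_pts // 2) * stages      # N/2 butterflies per stage
    fft = {"real_mults": 4 * butterflies, "real_adds": 6 * butterflies}
    one_fht = {"real_mults": 2 * butterflies, "real_adds": 3 * butterflies}
    two_fhts = {name: 2 * count for name, count in one_fht.items()}  # complex input
    return fft, one_fht, two_fhts

fft, one_fht, two_fhts = butterfly_op_counts(1024)
print(fft)       # {'real_mults': 20480, 'real_adds': 30720}
print(one_fht)   # {'real_mults': 10240, 'real_adds': 15360}
print(two_fhts)  # {'real_mults': 20480, 'real_adds': 30720}
```

For a complex input the two FHTs and the single FFT give identical butterfly counts, so any difference between them lies in the lower-order terms.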

It is easier to compare the FFT and FHT with a specific application in mind. The next
section looks at the specific application of the FHT and FFT to Jindalee signal processing,
where purpose-built computer hardware is used to increase processing throughput.

4  APPLICATION  OF  FHT/FFT  TO  JINDALEE  SIGNAL  PROCESSING

This section looks at the suitability of the FHT/FFT for current Jindalee signal processing,
and also the implications for hardware design in future Jindalee radars. The following
comparisons assume a radix-2 FHT and FFT. This is justified as it has been shown(ref.6,7)
that due to the similarity of the algorithms, any optimisation applied to one can also be
applied to the other with the same speed improvement.

Signal processing for the Jindalee OTHR(ref.8) relies heavily on the FFT. It is used for
ranging, where radar returns are separated into range bins; digital beamforming, where
radar returns are separated into azimuthal 'finger beams'; and Doppler processing, where
targets' radial speeds allow them to be separated from the large land/sea backscatter return.

In order to meet the required FFT load, Arithmetic Oriented (ARO) array processors were
designed and built(ref.9) at ERL. The hardware of the ARO processor is optimised around
the FFT butterfly operation, with a hardware multiplier and two hardware adders operating
in parallel (the radix-2 FFT butterfly requires a complex multiply and two complex adds).
By employing a degree of pipe-lining, each of these arithmetic units can perform a complex
operation in 240 ns, or a real operation in 320 ns. The combination of these parallel
arithmetic units allows a complex butterfly to be computed in 240 ns (three microcycles of the
arithmetic processor (AP)). It actually takes 27 AP microcycles to perform a butterfly from
start to finish. The pipe-lining, however, allows a butterfly to be completed every 240 ns.

It can be readily demonstrated that the ARO processor is not suited to calculating the
FHT efficiently. Consider the FHT butterfly, requiring two real multiplications and three
real adds. The butterfly could not be performed in less than 480 ns (assuming the two real
multiplications are performed as complex operations) and the FHT is capable of transforming
real data only. Jindalee signal processing requires a complex transform to be done, ie two
separate Hartley Transforms are required (as explained in Section 2). It is apparent the
FFT algorithm will execute approximately four times faster than the FHT on the ARO
processor (2 FHTs against 1 FFT and butterfly time at least twice as long for the FHT).
Clearly it would not be practical to implement the FHT on the ARO.
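As a rough check of the factor of four, the two ratios stated above can be combined directly. The symbols $t_{\mathrm{FFT}}$ and $t_{\mathrm{FHT}}$ (time per butterfly) and $B$ (butterflies per transform, taken as equal for the two radix-2 algorithms) are introduced here purely for illustration:

$$\frac{T_{\mathrm{FHT}}}{T_{\mathrm{FFT}}} = \frac{2 \, B \, t_{\mathrm{FHT}}}{B \, t_{\mathrm{FFT}}} \geq \frac{2 \, B \, (2\, t_{\mathrm{FFT}})}{B \, t_{\mathrm{FFT}}} = 4$$

The first factor of two comes from needing two Hartley Transforms per complex transform, and the second from the FHT butterfly taking at least twice as long on this hardware. The estimate ignores the order-N cost of converting the Hartley output to Fourier form.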

Future Jindalee radars will have an increased range and Doppler processing FFT load. The
digital beamforming load will also increase, but this is not likely to be done via FFT. For
this increased signal processing load, faster arithmetic processors will be required. We now
look briefly at the suitability of designing a new processor around the FHT rather than the
FFT.

To perform a real or complex Fourier Transform requires almost the same number of real
operations for both the FHT and FFT. Also, due to the similarity of the two fast algorithms,
neither appears to be more suited than the other in terms of ease of arithmetic hardware
design. It is therefore concluded that the Fast Hartley Transform offers no advantages over
the Fast Fourier Transform for the design of any future signal processor.

To increase FFT throughput there are other areas that can be addressed. The ARO processor
implements a radix-2 FFT. There exist FFT algorithms, such as the split-radix, that
execute in about half the time(ref.6). Discussion with colleagues indicates a hardware
limitation prevents the use of a faster FFT on the ARO.

A further limitation of the ARO processor is the lack of hardware 'bit-reversal'. Bit-reversal is
required to unscramble data prior to or after performing an FFT. This bit-reversal is currently
done in software and considerably slows the FFT routine. These hardware inadequacies,
plus factors such as vector set-up times, become more dominant for transforms of short data
sequences, and should be taken into account when designing a new processor.
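For readers unfamiliar with the term, the reordering performed by bit-reversal can be sketched in software as follows. This is a generic illustration of the permutation, not the ARO routine. The function name is illustrative.

```python
def bit_reverse_permute(x):
    """Return a copy of x reordered so element i moves to the bit-reversed index of i."""
    n_pts = len(x)
    if n_pts == 0 or n_pts & (n_pts - 1):
        raise ValueError("length must be a power of two")
    bits = n_pts.bit_length() - 1            # number of index bits, log2(N)
    out = [None] * n_pts
    for i in range(n_pts):
        rev, v = 0, i
        for _ in range(bits):                # reverse the low 'bits' bits of i
            rev = (rev << 1) | (v & 1)
            v >>= 1
        out[rev] = x[i]
    return out

print(bit_reverse_permute(list(range(8))))   # [0, 4, 2, 6, 1, 5, 3, 7]
```

The inner loop runs $\log_2 N$ times for each of the N elements when done this way in software, which gives some indication of why a hardware implementation is attractive for short transforms.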

5  CONCLUSION 

Despite much recent interest, there appears to be no significant benefit in using the Fast
Hartley Transform in place of the Fast Fourier Transform. The FHT requires a comparable
number of steps to execute and is of comparable complexity to the FFT. The FHT does
have the advantage that the forward and inverse transforms are the same, but this is only of
advantage on a limited memory machine.

For any future arithmetic processor, improved FFT performance can be achieved by addressing
the ARO hardware limitations. In particular:

-  Base the arithmetic hardware around the split-radix rather than
   radix-2 butterfly (or perhaps design the hardware so that any
   new algorithms can be readily microcoded in the future)

-  Implement hardware bit-reversal

-  Reduce vector set-up times